TritonCPUConfig()
Generates a configuration for running Towhee pipelines with a Triton inference server on the CPU. See Towhee Pipeline in Triton for details.
TritonCPUConfig(num_instances_per_device=1, max_batch_size=None, batch_latency_micros=None, preferred_batch_size=None)
Parameters
num_instances_per_device - int
Number of model instances to run on the CPU.
The value defaults to 1, indicating that one model instance runs on the CPU.
max_batch_size - int or None
The maximum batch size that the model in the pipeline supports for the types of batching that can be exploited by Triton. See Maximum Batch Size for details.
The value defaults to None, leaving Triton to generate the value.
batch_latency_micros - int or None
Latency allowed for Triton to process a delivered batch, in microseconds.
The value defaults to None, leaving Triton to generate the value.
preferred_batch_size - list[int] or None
A list of batch sizes that Triton should attempt to create.
The value defaults to None, leaving Triton to generate the value.
Returns
A TowheeConfig object with server set to a dictionary. The dictionary contains the specified parameters and their values, with device_ids set to None.
Examples
from towhee import pipe, ops, AutoConfig
auto_config1 = AutoConfig.TritonCPUConfig()
auto_config1.config # return {'server': {'device_ids': None, 'num_instances_per_device': 1, 'max_batch_size': None, 'batch_latency_micros': None, 'triton': {'preferred_batch_size': None}}}
# or set the configuration explicitly
auto_config2 = AutoConfig.TritonCPUConfig(num_instances_per_device=3,
max_batch_size=128,
batch_latency_micros=100000,
preferred_batch_size=[8, 16])
auto_config2.config # return {'server': {'device_ids': None, 'num_instances_per_device': 3, 'max_batch_size': 128, 'batch_latency_micros': 100000, 'triton': {'preferred_batch_size': [8, 16]}}}
# configurations can also be combined with +
auto_config3 = AutoConfig.LocalCPUConfig() + AutoConfig.TritonCPUConfig()
auto_config3.config # return {'device': -1, 'server': {'device_ids': None, 'num_instances_per_device': 1, 'max_batch_size': None, 'batch_latency_micros': None, 'triton': {'preferred_batch_size': None}}}
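For reference, the shape of the server dictionary and the merge behavior of the + operator in the last example can be sketched with plain dictionaries. This is only an illustration of the structure shown above, not Towhee's implementation; the helper names triton_cpu_config and merge_configs are hypothetical:

```python
# Hypothetical sketch of the 'server' dictionary that TritonCPUConfig
# produces, and of how two configs combine; NOT Towhee's implementation.

def triton_cpu_config(num_instances_per_device=1, max_batch_size=None,
                      batch_latency_micros=None, preferred_batch_size=None):
    """Build the 'server' section with device_ids fixed to None (CPU)."""
    return {
        'server': {
            'device_ids': None,  # always None for the CPU configuration
            'num_instances_per_device': num_instances_per_device,
            'max_batch_size': max_batch_size,
            'batch_latency_micros': batch_latency_micros,
            'triton': {'preferred_batch_size': preferred_batch_size},
        }
    }

def merge_configs(a, b):
    """Shallow merge of two config dicts: keys from both are kept."""
    merged = dict(a)
    merged.update(b)
    return merged

# Mirrors AutoConfig.LocalCPUConfig() + AutoConfig.TritonCPUConfig():
local_cpu = {'device': -1}  # shape of LocalCPUConfig().config shown above
combined = merge_configs(local_cpu, triton_cpu_config())
```

Here combined carries both the 'device' key from the local CPU config and the 'server' section from the Triton config, matching the auto_config3.config output above.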